PloidyNGS was developed for the visual exploration of genome ploidy levels using Next Generation Sequencing data.
The main program is explorePloidyNGS.py, which plots the distribution of allele proportions across heterozygous positions in the genome, treating these positions as biallelic. It assumes that the distribution of allele proportions at these positions reflects the ploidy level of the organism under study. For example, in a diploid genome one would expect most heterozygous positions to show 50% of each allele, whereas proportions of 33.3% and 66.7% would be expected in a triploid genome.
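The expectation described above can be written down directly. The following is a minimal sketch, not code from PloidyNGS: the function name expected_allele_proportions is hypothetical, and it assumes that every genome copy is sequenced equally, so that an allele present on k of n copies is seen in a fraction k/n of the reads.

```python
from fractions import Fraction

def expected_allele_proportions(ploidy):
    """Allele proportions expected at biallelic heterozygous positions.

    Assumption: all copies of the genome are sequenced equally, so an
    allele carried by k of the n copies is observed in k/n of the reads.
    """
    if ploidy < 2:
        raise ValueError("heterozygous positions require ploidy >= 2")
    # A heterozygous biallelic position has one allele on k copies,
    # with k ranging from 1 to ploidy - 1.
    return [Fraction(k, ploidy) for k in range(1, ploidy)]

for n in (2, 3, 4):
    percentages = [f"{float(p) * 100:.1f}%" for p in expected_allele_proportions(n)]
    print(n, percentages)
```

For ploidy 2 this gives a single expected peak at 50%, and for ploidy 3 two peaks at one third and two thirds, matching the examples in the text.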
The software was tested with simulated and real data from different strains of the budding yeast Saccharomyces cerevisiae, using the haploid genome of S. cerevisiae s288c as reference.
$ explorePloidyNGS.py --out outTable.txt --bam map_masked_genome.sorted.bam --genome masked_genome.fasta
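Inputs: a sorted BAM file of reads mapped to the masked genome* (--bam) and the masked genome sequence** in FASTA format (--genome).
* The mapping step can be carried out using e.g. Bowtie2
** A masked genome sequence can be obtained using e.g. RepeatMasker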
explorePloidyNGS.py outputs a table (outTable.txt in the example above) with the percentage of each observed allele (A, T, C or G) at each position, and a graphic (see simulated data below) that allows a visual exploration of the ploidy through the peaks in the observed allele proportions at heterozygous positions.
Check the documentation (README) for other options.
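To illustrate the kind of calculation behind the table and the graphic, here is a minimal sketch that turns per-base read counts into percentages and bins them. It is not the implementation used by explorePloidyNGS.py: the function names, the input layout (one dictionary of counts per position), the 5% bin width and the read counts are all assumptions made for illustration.

```python
from collections import Counter

def allele_percentages(counts):
    """Convert per-base read counts at one position into percentages."""
    depth = sum(counts.values())
    if depth == 0:
        return {}
    # Report only the bases that were actually observed.
    return {base: 100.0 * n / depth for base, n in counts.items() if n > 0}

def histogram_of_proportions(positions, bin_width=5):
    """Bin allele percentages from biallelic positions into bin_width-% bins."""
    bins = Counter()
    for counts in positions:
        percentages = allele_percentages(counts)
        if len(percentages) != 2:  # keep only biallelic heterozygous positions
            continue
        for pct in percentages.values():
            # Label each bin by its lower edge, e.g. 66.7% falls in bin 65.
            bins[int(pct // bin_width) * bin_width] += 1
    return dict(sorted(bins.items()))

# Made-up read counts for three positions; not real data.
positions = [
    {"A": 20, "C": 0, "G": 10, "T": 0},  # biallelic, roughly 67% / 33%
    {"A": 0, "C": 30, "G": 0, "T": 0},   # homozygous, skipped
    {"A": 0, "C": 11, "G": 0, "T": 19},  # biallelic, roughly 37% / 63%
]
print(histogram_of_proportions(positions))
```

With many positions, bins near one third and two thirds would accumulate counts for a triploid sample, which is the pattern the graphic is meant to reveal.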
The script simulatePloidyData.py was developed to generate simulated genomes with different ploidy and heterozygosity levels.
Here is an example of how the software is used:
$ simulatePloidy.py --genome genome.fna --ploidy 3 --heterozygosity 0.01
genome.fna is an unmasked haploid genome sequence, such as that of Saccharomyces cerevisiae s288c.
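As a rough picture of what such a simulation involves, the sketch below derives several genome copies from one haploid sequence. It is an assumption-laden illustration, not the algorithm of the simulation script: the function simulate_polyploid is hypothetical, heterozygosity is interpreted here as the per-position probability of introducing a biallelic difference, and the number of copies carrying the alternative allele is drawn uniformly.

```python
import random

def simulate_polyploid(haploid, ploidy, heterozygosity, seed=0):
    """Return `ploidy` copies of `haploid` with biallelic heterozygous sites.

    Assumption: `heterozygosity` is the probability that a given position
    differs between copies; the real tool may define it differently.
    """
    if ploidy < 2:
        raise ValueError("ploidy must be at least 2")
    rng = random.Random(seed)
    copies = [list(haploid) for _ in range(ploidy)]
    for pos, ref in enumerate(haploid):
        # Skip ambiguous bases and positions not selected as heterozygous.
        if ref not in "ACGT" or rng.random() >= heterozygosity:
            continue
        alt = rng.choice([base for base in "ACGT" if base != ref])
        # Place the alternative allele on 1 to ploidy - 1 copies, so the
        # position stays heterozygous with exactly two alleles.
        n_alt = rng.randint(1, ploidy - 1)
        for idx in rng.sample(range(ploidy), n_alt):
            copies[idx][pos] = alt
    return ["".join(copy) for copy in copies]

# Toy sequence instead of a real genome file.
for copy in simulate_polyploid("ACGTACGTACGTACGTACGT", ploidy=3, heterozygosity=0.2):
    print(copy)
```

Reads would then be simulated from these copies and mapped back to the haploid reference before plotting; that step is outside this sketch.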
[Figure grid: plots for simulated data, arranged by ploidy level (2, 3, 4, 5, 6 and 7), sequencing coverage, and heterozygosity level (0.0001, 0.001, 0.01 and 0.1).]
Renato Augusto Correa dos Santos and Diego Mauricio Riano Pachon. PloidyNGS: Visually exploring ploidy with Next Generation Sequencing Data (submitted to Bioinformatics Oxford in June/2016)
[Image gallery: 96 plots of simulated data, one for each combination of ploidy level (2, 3, 4, 5, 6, 7), heterozygosity (0.0001, 0.001, 0.01, 0.1) and sequencing coverage (15, 25, 50, 100).]